Transfer Value Iteration Networks
Authors
Abstract
Similar resources
Value Iteration Networks
We introduce the value iteration network (VIN): a fully differentiable neural network with a ‘planning module’ embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented a...
Generalized Value Iteration Networks: Life Beyond Lattices
In this paper, we introduce a generalized value iteration network (GVIN), which is an end-to-end neural network planning module. GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. We propose three novel differentiable kernels as graph convolution operators and show that the embedding-based ke...
Soft Value Iteration Networks for Planetary Rover Path Planning
Value iteration networks are an approximation of the value iteration (VI) algorithm implemented with convolutional neural networks to make VI fully differentiable. In this work, we study these networks in the context of robot motion planning, with a focus on applications to planetary rovers. The key challenging task in learning-based motion planning is to learn a transformation from terrain obse...
Value iteration and optimization of multiclass queueing networks
This paper considers in parallel the scheduling problem for multiclass queueing networks and optimization of Markov decision processes. It is shown that the value iteration algorithm may perform poorly when the algorithm is not initialized properly. The most typical case, where the initial value function is taken to be zero, may be a particularly bad choice. In contrast, if the value iteration algo...
Factored Value Iteration Converges
In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solution of factored Markov decision processes (fMDPs). The traditional approximate value iteration algorithm is modified in two ways. For one, the least-squares projection operator is modified so that it does not increase max-norm, and thus preserves convergence. The other modification is that we un...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2020
ISSN: 2374-3468,2159-5399
DOI: 10.1609/aaai.v34i04.6022